AITopics | minimax optimal regret

670c26185a3783678135b4697f7dbd1a-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 17:25:26 GMT

Our goal is to design algorithms that can automatically adapt to the unknown hardness of the problem, i.e., the number of best arms.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.46)

On Regret with Multiple Best Arms

Neural Information Processing Systems | Oct-3-2025, 03:01:02 GMT

We study a regret minimization problem with the existence of multiple best/near-optimal arms in the multi-armed bandit setting.

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin (0.28)
Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Data Science > Data Mining > Big Data (0.67)

86419aba4e5eafd2b1009a2e3c540bb0-Paper-Conference.pdf

Neural Information Processing Systems | Aug-16-2025, 15:13:03 GMT

artificial intelligence, auction, machine learning, (19 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York (0.04)
(2 more...)

Industry:

Marketing (0.95)
Information Technology > Services (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

Nearly Minimax Optimal Regret for Multinomial Logistic Bandit

Neural Information Processing Systems | May-27-2025, 15:46:58 GMT

In this paper, we study the contextual multinomial logit (MNL) bandit problem in which a learning agent sequentially selects an assortment based on contextual information, and user feedback follows an MNL choice model. There has been a significant discrepancy between lower and upper regret bounds, particularly regarding the maximum assortment size $K$. Additionally, the variation in reward structures between these bounds complicates the quest for optimality. Under uniform rewards, where all items have the same expected reward, we establish a regret lower bound of $\Omega(d\sqrt{T/K})$ and propose a constant-time algorithm, OFU-MNL, that achieves a matching upper bound of $\tilde{\mathcal{O}}(d\sqrt{T/K})$. We also provide instance-dependent minimax regret bounds under uniform rewards. Under non-uniform rewards, we prove a lower bound of $\Omega(d\sqrt{T})$ and an upper bound of $\tilde{\mathcal{O}}(d\sqrt{T})$, also achievable by OFU-MNL. Our empirical studies support these theoretical findings.
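
As a rough illustration of the MNL choice model the abstract refers to, the sketch below computes choice probabilities for a single assortment under the standard MNL form, with an outside (no-choice) option of utility zero. The function name `mnl_choice_probs`, the feature vectors, and the parameter `theta` are hypothetical and chosen only for this example; this is not the OFU-MNL algorithm.

```python
import math

def mnl_choice_probs(features, theta):
    """Choice probabilities for one assortment under a standard MNL model.

    features: list of per-item feature vectors (hypothetical inputs).
    theta: parameter vector of the same dimension (hypothetical).
    Returns (p_outside, p_items), where the outside option has utility 0.
    """
    utilities = [sum(x * w for x, w in zip(feat, theta)) for feat in features]
    # Shift by the largest utility (including the outside option's 0)
    # so that exp() cannot overflow.
    shift = max([0.0] + utilities)
    weights = [math.exp(u - shift) for u in utilities]
    outside = math.exp(-shift)
    total = outside + sum(weights)
    return outside / total, [w / total for w in weights]

# Two items with zero utility: items and outside option are equally likely.
p_out, p_items = mnl_choice_probs([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

If every item is assumed to carry the same reward of 1 (one reading of the uniform-reward case), the expected reward of the assortment is simply `1 - p_out`, the probability that any item is chosen.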

artificial intelligence, machine learning, multinomial logistic bandit, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.67)

Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

Di, Qiwei, He, Jiafan, Zhou, Dongruo, Gu, Quanquan

arXiv.org Machine Learning | Feb-14-2024

We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel, where an agent repeatedly interacts with a stochastic environment and seeks to reach a certain goal state while minimizing the cumulative cost. Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy. In this paper, we propose a new algorithm to eliminate these restrictive assumptions. Our algorithm is based on extended value iteration with a fine-grained variance-aware confidence set, where the variance is estimated recursively from high-order moments. Our algorithm achieves an $\tilde{\mathcal O}(dB_*\sqrt{K})$ regret bound, where $d$ is the dimension of the feature mapping in the linear transition kernel, $B_*$ is the upper bound of the total cumulative cost for the optimal policy, and $K$ is the number of episodes. Our regret upper bound matches the $\Omega(dB_*\sqrt{K})$ lower bound of linear mixture SSPs in Min et al. (2022), which suggests that our algorithm is nearly minimax optimal.

algorithm, inequality hold, minimax optimal regret, (12 more...)

arXiv.org Machine Learning

2402.08998

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

On Regret with Multiple Best Arms

Zhu, Yinglun, Nowak, Robert

arXiv.org Machine Learning | Oct-22-2020

We study a regret minimization problem with the existence of multiple best/near-optimal arms in the multi-armed bandit setting. We consider the case when the number of arms/actions is comparable to or much larger than the time horizon, and make no assumptions about the structure of the bandit instance. Our goal is to design algorithms that can automatically adapt to the unknown hardness of the problem, i.e., the number of best arms. Our setting captures many modern applications of bandit algorithms where the action space is enormous and the information about the underlying instance/structure is unavailable. We first propose an adaptive algorithm that is agnostic to the hardness level and theoretically derive its regret bound. We then prove a lower bound for our problem setting, which indicates: (1) no algorithm can be minimax optimal simultaneously over all hardness levels; and (2) our algorithm achieves a rate function that is Pareto optimal. With additional knowledge of the expected reward of the best arm, we propose another adaptive algorithm that is minimax optimal, up to polylog factors, over all hardness levels. Experimental results confirm our theoretical guarantees and show advantages of our algorithms over the previous state-of-the-art.
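
The abstract ties the hardness of the problem to the number of best arms. One simple way to build intuition for that, offered here only as an illustration and not as the algorithm proposed in the paper, is to ask how likely a uniformly random subset of arms is to contain at least one best arm. The function name and its parameters below are hypothetical.

```python
from math import comb

def prob_subset_contains_best(n_arms, n_best, subset_size):
    """Probability that a uniformly random subset of arms (drawn without
    replacement) contains at least one of the n_best best arms."""
    if not 0 <= n_best <= n_arms or not 0 <= subset_size <= n_arms:
        raise ValueError("need 0 <= n_best, subset_size <= n_arms")
    # comb(n, k) returns 0 when k > n, so the miss probability is 0
    # whenever the subset is too large to avoid every best arm.
    miss = comb(n_arms - n_best, subset_size) / comb(n_arms, subset_size)
    return 1.0 - miss

# With 1000 arms and a subset of 20, more best arms means an easier search.
few_best = prob_subset_contains_best(1000, 10, 20)
many_best = prob_subset_contains_best(1000, 100, 20)
```

Under this toy view, the same subset size finds a best arm far more reliably when best arms are plentiful, which is one reason an algorithm that does not know the number of best arms cannot tune itself the same way for every hardness level.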

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2006.14785

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.87)

Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays

Vural, N. Mert, Gokcesu, Hakan, Gokcesu, Kaan, Kozat, Suleyman S.

arXiv.org Artificial Intelligence | Nov-25-2019

We investigate the adversarial bandit problem with multiple plays under semi-bandit feedback. We introduce a highly efficient algorithm that asymptotically achieves the performance of the best switching $m$-arm strategy with minimax optimal regret bounds. To construct our algorithm, we introduce a new expert advice algorithm for the multiple-play setting. By using our expert advice algorithm, we additionally improve the best-known high-probability bound for the multi-play setting by $O(\sqrt{m})$. Our results are guaranteed to hold in an individual sequence manner since we have no statistical assumption on the bandit arm gains. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art algorithms.
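
For orientation, the sketch below shows the classical single-play Exp3 update, which appears among this entry's tags. It is a simpler stand-in and not the multiple-play algorithm the abstract describes. The function names and the exploration rate `gamma` are assumptions made for this example, and gains are assumed to lie in [0, 1].

```python
import math

def exp3_probs(weights, gamma):
    """Mix normalized weights with uniform exploration (classical Exp3)."""
    k = len(weights)
    total = sum(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]

def exp3_update(weights, gamma, arm, gain):
    """Return new weights after observing `gain` in [0, 1] for the played arm."""
    k = len(weights)
    probs = exp3_probs(weights, gamma)
    # Importance-weighted gain estimate: nonzero only for the played arm,
    # and unbiased because it is divided by the probability of playing it.
    estimate = gain / probs[arm]
    new_weights = list(weights)
    new_weights[arm] *= math.exp(gamma * estimate / k)
    return new_weights

# Start from uniform weights, play arm 0, and observe a gain of 1.
start = [1.0, 1.0, 1.0]
updated = exp3_update(start, 0.1, 0, 1.0)
```

Because only the played arm's gain is observed, the division by the play probability is what keeps the estimate unbiased without any statistical assumption on the gains.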

algorithm, exp3, exp4, (15 more...)

arXiv.org Artificial Intelligence

1911.11122

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.71)
